Training and delayed reinforcements in Q-learning agents
Authors
Abstract
Q-learning can converge considerably faster when it is helped by immediate reinforcements from a trainer able to judge how useful an action is as stage setting for the agent's goal. This paper investigates this hypothesis experimentally, studying the integration of immediate reinforcements (also called training reinforcements) with standard delayed reinforcements (namely, reinforcements assigned only when the agent-environment relationship reaches a particular state, such as when the agent reaches a target). The paper proposes two new algorithms (TL and MTL) that can exploit training reinforcements even when these are locally wrong and misleading. The proposed algorithms are tested against Q-learning and against other algorithms described in the literature (AB-LEC and BB-LEC) that also make use of training reinforcements. Experiments are run in a grid world where a Q-agent, a simple simulated robot, must learn to reach a target. Accepted for publication in International Journal of Intelligent Systems, 1997. In press.
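The combination of delayed and training reinforcements described in the abstract can be illustrated with a minimal sketch. This is not the TL or MTL algorithm, whose details the abstract does not give. It is plain tabular Q-learning in which an optional trainer signal is added to the delayed reward. The grid size, the learning parameters, the `trainer_weight` argument, and the Manhattan-distance trainer are all assumptions made for illustration.

```python
import random


def train(size=5, episodes=200, trainer_weight=0.0, alpha=0.5, gamma=0.9,
          epsilon=0.1, max_steps=200, seed=0):
    """Tabular Q-learning on a size x size grid (illustrative sketch only).

    Returns the Q-table and the number of steps taken in each episode.
    """
    rng = random.Random(seed)
    actions = [(0, 1), (0, -1), (1, 0), (-1, 0)]
    target = (size - 1, size - 1)
    q = {}  # maps (state, action_index) to a value; missing entries count as 0
    steps_per_episode = []

    def dist(s):
        # Manhattan distance to the target, used by the hypothetical trainer.
        return abs(s[0] - target[0]) + abs(s[1] - target[1])

    for _ in range(episodes):
        state, steps = (0, 0), 0
        while state != target and steps < max_steps:
            # Epsilon-greedy action selection with random tie-breaking.
            if rng.random() < epsilon:
                a = rng.randrange(len(actions))
            else:
                values = [q.get((state, i), 0.0) for i in range(len(actions))]
                best = max(values)
                a = rng.choice([i for i, v in enumerate(values) if v == best])
            dx, dy = actions[a]
            nxt = (min(max(state[0] + dx, 0), size - 1),
                   min(max(state[1] + dy, 0), size - 1))
            # Delayed reinforcement: given only when the target is reached.
            delayed = 1.0 if nxt == target else 0.0
            # Training reinforcement (assumed form): immediate judgement of
            # whether the action brought the agent closer to the target.
            training = trainer_weight * (dist(state) - dist(nxt))
            future = 0.0 if nxt == target else max(
                q.get((nxt, i), 0.0) for i in range(len(actions)))
            old = q.get((state, a), 0.0)
            q[(state, a)] = old + alpha * (delayed + training + gamma * future - old)
            state, steps = nxt, steps + 1
        steps_per_episode.append(steps)
    return q, steps_per_episode
```

Calling `train(trainer_weight=0.0)` uses delayed reinforcements only, while a positive `trainer_weight` mixes in the immediate signal. Comparing the two `steps_per_episode` lists is one simple way to observe the effect on learning speed in this toy setting.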
Similar resources
Multigrid Q-learning
Reinforcement learning scales poorly when reinforcements are delayed. The problem of propagating information from delayed reinforcements to the states and actions that have an effect on the reinforcement is similar to the problem of propagating information in a discretized boundary value problem. Multigrid methods have been shown to decrease the number of updates required to solve boundary value pr...
Genetic Encoding of Agent Behavioral Strategy
The general framework tackled in this paper is the automatic generation of intelligent collective behaviors using genetic programming and reinforcement learning. We define a behavior-based system relying on automatic design process using artificial evolution to synthesize high level behaviors for autonomous agents. Behavioral strategies are described by tree-based structures, and manipulated by...
Twelfth National Conference on Artificial Intelligence (AAAI-94). Incorporating Advice into Agents that Learn from Reinforcements
Richard Maclin, Jude W. Shavlik. Computer Sciences Dept., University of Wisconsin, 1210 West Dayton Street, Madison, WI 53706. Email: fmaclin,[email protected]. Abstract: Learning from reinforcements is a promising approach for creating intelligent agents. However, reinforcement learning usually requires a large number of training epis...
Incorporating Advice into Agents that Learn from Reinforcements
Learning from reinforcements is a promising approach for creating intelligent agents. However, reinforcement learning usually requires a large number of training episodes. We present an approach that addresses this shortcoming by allowing a connectionist Q-learner to accept advice given, at any time and in a natural manner, by an external observer. In our approach, the advice-giver watches the ...
The Role of the Trainer in Reinforcement Learning
In this paper we propose a three-stage incremental approach to the development of autonomous agents. We discuss some issues about the characteristics which differentiate reinforcement programs (RPs), and define the trainer as a particular kind of RP. We present a set of results obtained running experiments with a trainer which provides guidance to the AutonoMouse, our mouse-sized autonomous rob...
Journal title:
- Int. J. Intell. Syst.
Volume 12 Issue
Pages -
Publication date 1997